Hierarchical clustering analysis of reading aloud data: a new technique for evaluating the performance of computational models
نویسندگان
چکیده
DRC (Coltheart et al., 2001) and CDP++ (Perry et al., 2010) are two of the most successful models of reading aloud. These models differ primarily in how their sublexical systems convert letter strings into phonological codes. DRC adopts a set of grapheme-to-phoneme conversion rules (GPCs) while CDP++ uses a simple trained network that has been exposed to a combination of rules and the spellings and pronunciations of known words. Thus far the debate between fixed rules and learned associations has largely emphasized reaction time experiments, error rates in dyslexias, and item-level variance from large-scale databases. Recently, Pritchard et al. (2012) examined the models' non-word reading in a new way. They compared responses produced by the models to those produced by 45 skilled readers. Their item-by-item analysis is informative, but leaves open some questions that can be addressed with a different technique. Using hierarchical clustering techniques, we first examined the subject data to identify if there are classes of subjects that are similar to each other in their overall response profiles. We found that there are indeed two groups of subject that differ in their pronunciations for certain consonant clusters. We also tested the possibility that CDP++ is modeling one set of subjects well, while DRC is modeling a different set of subjects. We found that CDP++ does not fit any human reader's response pattern very well, while DRC fits the human readers as well as or better than any other reader.
منابع مشابه
Content Analysis of Infographics with the Theme of Reading Based on the Lasswell's Communication Model
Purpose: Infographics are considered a powerful communication and information medium in human-information interaction as well as in successful transmission of messages. The aim of the present study is to arrive at a model or framework based on the content of infographics published about reading, to discover and identify their content and also to introduce the capabilities and attractiveness of ...
متن کاملAssessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories
In recent years, the tremendous and increasing growth of spatial trajectory data and the necessity of processing and extraction of useful information and meaningful patterns have led to the fact that many researchers have been attracted to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...
متن کاملبه کارگیری روشهای خوشهبندی در ریزآرایه DNA
Background: Microarray DNA technology has paved the way for investigators to expressed thousands of genes in a short time. Analysis of this big amount of raw data includes normalization, clustering and classification. The present study surveys the application of clustering technique in microarray DNA analysis. Materials and methods: We analyzed data of Van’t Veer et al study dealing with BRCA1...
متن کاملChoosing the Best Hierarchical Clustering Technique Based on Principal Components Analysis for Suspended Sediment Load Estimation
1- INTRODUCTION The assessment of watershed sediment load is necessary for controling soil erosion and reducing the potential of sediment production. Different estimates of sediment amounts along with the lack of long-term measurements limits the accessibility to reliable data series of erosion rate and sediment yield. Therefore, the observed data of suspended sediment load could be used to ...
متن کاملA Clustering Based Location-allocation Problem Considering Transportation Costs and Statistical Properties (RESEARCH NOTE)
Cluster analysis is a useful technique in multivariate statistical analysis. Different types of hierarchical cluster analysis and K-means have been used for data analysis in previous studies. However, the K-means algorithm can be improved using some metaheuristics algorithms. In this study, we propose simulated annealing based algorithm for K-means in the clustering analysis which we refer it a...
متن کامل